Abstract

Objective To investigate whether discrepancies in trials of use of bone 
marrow stem cells in patients with heart disease account for the variation 
in reported effect size in improvement of left ventricular function. 

Design Identification and counting of factual discrepancies in trial reports, 
and sample size weighted regression against therapeutic effect size. 
Meta-analysis of trials that provided sufficient information. 

Data sources PubMed and Embase from inception to April 2013. 

Eligibility for selecting studies Randomised controlled trials evaluating 
the effect of autologous bone marrow stem cells for heart disease on 
mean left ventricular ejection fraction. 

Results There were over 600 discrepancies in 133 reports from 49 trials.
There was a significant association between the number of discrepancies 
and the reported increment in EF with bone marrow stem cell therapy 
(Spearman's r=0.4, P=0.005). Trials with no discrepancies were a small 
minority (five trials) and showed a mean EF effect size of -0.4%. The 
24 trials with 1-10 discrepancies showed a mean effect size of 2.1%. 
The 12 with 11-20 discrepancies showed a mean effect size of 3.0%.
The three with 21-30 discrepancies showed a mean effect size of 5.7%.
The high discrepancy group, comprising five trials with over 30 
discrepancies each, showed a mean effect size of 7.7%. 



Conclusions Avoiding discrepancies is difficult but is important because 
discrepancy count is related to effect size. The mechanism is unknown 
but should be explored in the design of future trials because in the five 
trials without discrepancies the effect of bone marrow stem cell therapy 
on ejection fraction is zero. 

Introduction 

Autologous bone marrow stem cells offer an exciting
opportunity for improvement of left ventricular function, reverse
remodelling, and scar size reduction in patients with ischaemic
heart disease. Results, however, have been conflicting. The
reason for the differences between the various trials of effect
on left ventricular function has so far not been identified.
Meta-analyses have confirmed a significant positive effect on
average but have found no clear explanation for the conflicts
between individual trials.

It has recently been discovered that some pioneering trials of
autologous bone marrow stem cells have unexplained
discrepancies that cast doubt on their validity. It was not
possible to report this directly in the journals that published the
trials. Discrepancies in reports have never been systematically
explored as a possible explanatory variable for the effect size
of autologous bone marrow stem cells on ejection fraction.

We examined reports of the randomised controlled trials of bone 
marrow stem cell therapy for discrepancies of design, methods, 
or results and examined the relation between the number of
discrepancies and the effect sizes reported. We defined a
discrepancy as two (or more) reported facts that cannot both be 
true because they are logically or mathematically incompatible. 

Methods 

Search strategy and eligibility criteria 

We searched Embase and PubMed (1966 to April 2013), using 
the following search strategy: ("bone marrow cell" OR "bone 
marrow cells" OR "stem cells" OR "stem cell" OR "progenitor 
cell" OR "progenitor cells") AND ("myocardial infarction" OR 
"coronary artery disease" OR cardiomyopathy OR "heart 
failure") AND random*. 

We also manually searched citation lists and PubMed links
to related citations. We included trials that met the following 
criteria: 

• Trial reporting the effect on mean ejection fraction of 
infusion of autologous stem cells derived from bone 
marrow in patients with acute or established cardiac disease

• At least one publication by the authors described it as 
randomised 

• Available through our institution and in a language 
understood well by at least one investigator. 

For each trial identified by this method, we used the international 
standard randomised controlled trial number registry (isrctn.org), 
ClinicalTrials.gov registry, PubMed, Google, and manual
evaluation of references to search for other reports from that 
trial published until end of April 2013. 

Data extraction 

Two authors (ANN and SJ) extracted data from each trial, with 
disputes resolved by a third author (MJS). When the ejection 
fraction was measured by more than one imaging technique 
(magnetic resonance imaging (MRI), echocardiography, 
radionuclide imaging, left ventriculography), we used the data 
from the technique specified as the primary endpoint. If this 
was not defined, we used the technique that was highlighted in 
the abstract or (if the abstract was not specific) given priority 
in the conclusion or (if not mentioned in either) given priority 
in the results. When the ejection fraction effect size was reported 
at multiple time points, we used that of the longest follow-up. 

We defined the ejection fraction effect size as the change in
ejection fraction in the active arm minus the change in ejection 
fraction in control arm. We used this if it was stated directly in 
the trial. If it was not directly stated, we calculated it from the 
changes provided in each arm or, when these were not provided, 
from the baseline and follow-up values in each arm. 
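The order of preference described above can be sketched in a few lines of Python. This is an illustration only, not the authors' code: the function name `ef_effect_size` and its argument names are hypothetical, and all values are in percentage units of ejection fraction.

```python
from typing import Optional


def ef_effect_size(
    active_baseline: Optional[float] = None,
    active_followup: Optional[float] = None,
    control_baseline: Optional[float] = None,
    control_followup: Optional[float] = None,
    active_change: Optional[float] = None,
    control_change: Optional[float] = None,
    stated_effect: Optional[float] = None,
) -> float:
    """Return the EF effect size using the most direct information available."""
    # 1. Use the effect size if the trial states it directly.
    if stated_effect is not None:
        return stated_effect
    # 2. Otherwise use the change reported in each arm.
    if active_change is not None and control_change is not None:
        return active_change - control_change
    # 3. Otherwise fall back to baseline and follow-up values in each arm.
    if None in (active_baseline, active_followup, control_baseline, control_followup):
        raise ValueError("not enough information to derive an effect size")
    return (active_followup - active_baseline) - (control_followup - control_baseline)
```

For example, a hypothetical trial in which ejection fraction rose from 40 to 46 in the active arm and from 41 to 43 in the control arm would give an effect size of 4 percentage units.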

We used the standard error of the effect size if it was stated 
explicitly in the trial. If the confidence interval of the effect size 
was given instead, we extracted the standard error. If only the 
standard errors or standard deviations or confidence intervals 
of the changes in each arm were provided, we then extracted 
the two standard deviations and sample sizes and used them to 
calculate the standard error of the estimate. 
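The conversions described above are not spelt out in the text. A minimal sketch using the standard textbook formulas is shown below. It assumes a normal approximation for 95% confidence intervals (z=1.96) and independent trial arms. The function names are hypothetical, and the authors' exact calculations may have differed, for example by using a t distribution in small trials.

```python
import math


def se_from_ci(lower: float, upper: float, z: float = 1.96) -> float:
    """Standard error implied by a symmetric confidence interval (normal approximation)."""
    return (upper - lower) / (2.0 * z)


def sd_from_se(se: float, n: int) -> float:
    """Convert the standard error of a mean back to a standard deviation."""
    return se * math.sqrt(n)


def se_of_difference(sd_active: float, n_active: int,
                     sd_control: float, n_control: int) -> float:
    """Standard error of the difference between two independent arm-level mean changes."""
    return math.sqrt(sd_active ** 2 / n_active + sd_control ** 2 / n_control)
```

When an arm reports a standard error or confidence interval of its change instead of a standard deviation, `se_from_ci` and `sd_from_se` recover the standard deviation first, and `se_of_difference` then combines the two arms.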



Detection of discrepancies 

The trials were then examined for discrepancies, which were 
categorised into the following three types:

1. Discrepancies in the design: for example, conflicting
statements as to whether the study was randomised (tabulated
in appendix 1)

2. Discrepancies in methods and baseline characteristics: for
example, sample or subgroup sizes that could not be an
integer number of patients (listed in appendix 2)

3. Discrepancies in results: for example, conflicts between
tables and figures or impossible values (listed in appendix
3).

Eight authors (ANN, DPF, GDC, HD, JPH, MM, MJS, SJ) read 
all the reports, except those of the four trials for which the 
discrepancies had already been found and published. Proposed
discrepancies were discussed. A discrepancy was declared valid 
for inclusion in the study only if no member of the group could 
find a valid explanation. 

Contradictions in numerical values were considered as 
discrepancies but errors in spelling or grammar were not. If the 
same conflicting statements appeared more than once (for 
example, a trial repeatedly described as randomised in one 
publication, and repeatedly described as accepter-rejecter in 
another), this was considered a single discrepancy. 

Trials, and their reports, were coded with a "t" number or "r"
number, respectively. Appendix 4 provides a decoded list of
trials and reports with web links to the sources. Each discrepancy
was numbered with a three digit code after the two digit "t"
number. The first digit of the discrepancy code was allocated
according to the type of discrepancy (that is, 1, 2, or 3, as listed
above).
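As an illustration of this coding scheme, the sketch below parses a code such as t09/308 into its trial number, discrepancy type, and remaining digits. The names `parse_discrepancy_code` and `DiscrepancyCode` are hypothetical, and treating the last two digits as a running serial number is an assumption. The text specifies only the meaning of the first digit.

```python
import re
from typing import NamedTuple

# First digit of the three digit code, as defined in the list of discrepancy types.
DISCREPANCY_TYPES = {
    1: "design",
    2: "methods and baseline characteristics",
    3: "results",
}

_CODE = re.compile(r"t(\d{2})/([123])(\d{2})")


class DiscrepancyCode(NamedTuple):
    trial: int
    kind: str
    serial: int  # assumed to be a running number within the type


def parse_discrepancy_code(code: str) -> DiscrepancyCode:
    """Split a code such as 't09/308' into trial number, type, and serial."""
    match = _CODE.fullmatch(code.strip())
    if match is None:
        raise ValueError(f"not a valid discrepancy code: {code!r}")
    trial, type_digit, serial = match.groups()
    return DiscrepancyCode(int(trial), DISCREPANCY_TYPES[int(type_digit)], int(serial))
```

A strict pattern of this kind also rejects mistyped codes, such as one in which the letter "l" has replaced the digit "1".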

Of the trials, four (t07, t08, t21, and t49) had already undergone
this process of identification and checking of discrepancies; for
these we therefore used the discrepancies as previously published
by our group. Because the present publication focuses
on counting discrepancies, where our previous report listed
more than one discrepancy on a single row, we now separated
these on to individual rows.

Assessing risk of bias 

Risk of bias in the included trials was assessed with the 
Cochrane Collaboration's risk of bias assessment tool (see 
appendix 5). Each trial was assessed by two independent 
observers and any differences resolved by a third observer. 

Data analysis 

We visualised the relation between the ejection fraction effect 
size and the discrepancy count with a scatter plot and quantified 
it with Spearman's rank correlation coefficient. It was further 
visualised with a histogram with trials grouped by the number 
of discrepancies in intervals of 10. Means were weighted by 
sample size. We constructed a funnel plot of the data and used 
Egger's test to assess asymmetry.
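The two descriptive steps above (rank correlation, then sample size weighted means within bands of discrepancy count) can be sketched in plain Python as follows. This is not the authors' R code. The function names are hypothetical, ties are handled with average ranks, and the bands follow those used in the text (0, 1-10, 11-20, 21-30, more than 30).

```python
import math
from collections import defaultdict


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_r(x, y):
    """Spearman's rank correlation, computed as the Pearson correlation of ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


def discrepancy_band(count):
    """Label the band of discrepancy count: 0, 1-10, 11-20, 21-30, or >30."""
    if count == 0:
        return "0"
    if count > 30:
        return ">30"
    low = ((count - 1) // 10) * 10 + 1
    return f"{low}-{low + 9}"


def weighted_mean_by_band(trials):
    """trials: iterable of (discrepancy_count, effect_size, sample_size) tuples."""
    totals = defaultdict(lambda: [0.0, 0.0])
    for count, effect, n in trials:
        acc = totals[discrepancy_band(count)]
        acc[0] += effect * n  # sample size weighted sum of effect sizes
        acc[1] += n
    return {band: num / den for band, (num, den) in totals.items()}
```

The sketch gives the correlation coefficient only. The P value reported alongside it would need a separate significance test.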

A meta-analysis was conducted for trials that provided sufficient 
data to weight the effect size estimates by a function of the 
reciprocal of the square of the standard error of the effect size 
estimate. We pooled the data on ejection fraction effect size for 
this subset of trials using a random effects model and present 
them as weighted mean differences with 95% confidence 
intervals. 
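For readers unfamiliar with random effects pooling, the sketch below shows one common form of it. It uses the DerSimonian-Laird estimate of between-trial variance as a simple stand-in. The text does not state which estimator was used, so this is an assumption, and the function name `random_effects_pool` is hypothetical.

```python
import math


def random_effects_pool(effects, standard_errors, z=1.96):
    """Pool trial-level mean differences with a DerSimonian-Laird random effects model.

    Returns (pooled effect, lower 95% limit, upper 95% limit, tau squared).
    """
    if len(effects) != len(standard_errors) or len(effects) < 2:
        raise ValueError("need matching lists with at least two trials")
    # Fixed effect step: weights are the reciprocal of the squared standard error.
    w = [1.0 / se ** 2 for se in standard_errors]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q measures how much trials disagree beyond their own uncertainty.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)  # between-trial variance
    # Random effects step: add tau squared to each trial's variance.
    w_star = [1.0 / (se ** 2 + tau2) for se in standard_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    return pooled, pooled - z * se_pooled, pooled + z * se_pooled, tau2
```

Applied separately to the trials in each band of discrepancy count, a routine of this kind yields one weighted mean difference and confidence interval per band.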

As part of an exploratory analysis we performed univariate and
multivariate linear regression analyses including the discrepancy
count, sample size, and the five specific domains of the
Cochrane risk of bias tool as predictor variables for the effect
size. Any aspect of bias that was agreed to be "unclear" was
treated as a "no." The multivariate model was built by stepwise
backwards selection based on the Akaike information criterion.

Data analysis was carried out with R (version 3.0.2, R
Foundation), the graphics package ggplot2 (version 0.9.0),
and the meta-analysis package metafor (version 1.9).

Results 

Figure 1 shows how we identified trials. We identified 49
randomised trials reporting the effect of bone marrow stem cells 
on ejection fraction for cardiac disease (appendix 6). Length of 
follow-up ranged from three months to 65 months, with modal 
duration of six months. Each trial had between one and 13 
reports. There were 133 reports in total (see appendix 4 for the 
list of trials and reports). We identified one study during the 
search in which the focus was on safety alone (appendix 7). 

Discrepancies in design, methods, or results

We identified 604 instances of discrepancy (appendices 1-3) 
within a trial report or between reports of that trial. We identified 
44 discrepancies in the reports of the study on safety. There 
were many types of discrepancy, as shown by the examples in 
table 1. Table 2 shows examples from the trials of phenomena
that are unusual but were not counted as discrepancies.

Many aspects of the reports contained discrepancies. Even the 
primary endpoint was not spared (t09/308, t14/301, t19/305).
Sometimes the discrepancies seemed to affect whether the
difference between trial arms was significant (t10/301). Effect
size, defined as the increment in ejection fraction from bone 
marrow stem cell therapy, ranged from -3.9 to 14 percentage 
units. Numbers of discrepancies in individual trials across all 
their reports ranged from 0 to 89. 

There was a significant correlation between the number of 
discrepancies and the reported ejection fraction effect size 
(Spearman's r=0.4, P=0.005, fig 2). There were only five
studies with no discrepancies, and these showed a mean effect 
size of -0.4%, with this average weighted by sample size. The 
24 trials with one to 10 discrepancies showed mean effect size 
of 2.1%; the 12 with 11 to 20 discrepancies showed mean effect
size of 3.0%; the three with 21-30 discrepancies showed mean
effect size of 5.7%; and five high discrepancy trials, with over
30 discrepancies each, showed a mean effect size of 7.7% (fig 3).

Publication bias 

The funnel plot (appendix 8) did not show significant asymmetry 
(Egger's test P=0.4) that would suggest publication bias. 

Discrepancies and risk of bias 

The results of the exploratory univariate and multivariate 
analyses are presented in appendix 9. Only the number of 
discrepancies (P<0.001) and sequence generation (P=0.03) 
remained significant contributors to the effect size (adjusted 
R²=0.38, P<0.001).

Meta-analysis of studies providing 
information on uncertainty of effect size 

We could adequately extract the standard error of the effect size
estimate in only 31 trials to allow a formal meta-analysis to be
conducted using this information (appendix 10). The weighted
mean effect size was 0.0 (95% confidence interval -4.67 to
4.65) for trials with no discrepancies; 1.9 (0.30 to 3.57) for trials
with 1-10 discrepancies; 4.6 (1.64 to 7.61) for trials with 11-20
discrepancies; 4.4 (-0.97 to 9.75) for trials with 21-30
discrepancies; and 10.4 (8.44 to 12.36) for trials with more than
30 discrepancies.

Discussion 

Whenever we present scientific information we risk introducing 
conflicting statements that form discrepancies. Our study shows 
that scientists who achieve progressively better consistency of 
reporting find progressively smaller effects on ejection fraction 
of treatment with bone marrow stem cells. In trials with a 
discrepancy count of zero, the ejection fraction effect seems to 
be zero. 

Study limitations 

We were unable to blind ourselves to effect size because this 
was embedded within the report itself. There might be additional
unidentified discrepancies. Our work involved developing newly
derived mathematical limits on what is possible (appendix 11).
There could be other such limits that have not yet been 
established. We invite readers to contribute either new 
discrepancies or new general methods for identifying the 
impossible. 

Our method of counting discrepancies is imperfect because there 
is no universally accepted convention. We have tried to be 
consistent (appendices 1-3) but are open to suggestions from 
readers. Some readers might consider what we list as a single 
discrepancy to be several (for example, multiple repetitions of 
the same contradiction) or what we consider several to be just 
one (arguing, for example, that multiple discrepancies in a table 
might have been values from a different trial pasted into a 
manuscript accidentally). 

We have taken a simple approach of including all trials, 
including some that we have previously identified as containing 
discrepancies and that showed a large effect of stem cells. With
exclusion of these four trials, Spearman's rank correlation
coefficient in figure 2 is lower at 0.3 (P=0.03) but similar. We
were able to examine only studies whose results have been 
reported. We examined clinical trial registries but many trials 
seem to have sped through to publication, with the registration 
step skipped. 

In some cases it was not clear whether a trial was randomised. 
Previous meta-analysts have handled the confusion in 
contradictory ways, with some trials being classified as 
randomised in one meta-analysis and non-randomised in 
others. Our policy considered a trial eligible if an author
of the primary report stated at any stage that the trial was 
randomised. We recognise that other conventions for inclusion 
would also have been possible. This will remain a challenge as 
long as primary authors find the distinction puzzling. 

Our main analysis is weighted simply by sample size because 
this was available in every trial. In formal meta-analysis it is 
ideal to weight by a function of the reciprocal of the square of 
the standard error of the effect size estimate, but more than a 
third of the trials did not provide this information. For those 
trials that did provide sufficient data to weight the effect size 
estimates in this way, we conducted a formal meta-analysis 
(shown in appendix 10). 

We have not attempted to control for the fact that some trials 
issued more reports than others, and different reports gave 
information to different levels of depth. It is difficult to
satisfactorily control for this because sometimes the multiple
reports are pure duplication, sometimes they cover 
non-overlapping information, and sometimes they are 
contradictory. 

We excluded five reports because they were in Chinese language 
journals to which we had no access. We do not know to what
extent this access limitation might have biased our results. 
Although it might have been possible to use additional routes 
to obtain the full text, and then arrange a translation, the 
translation process could always be suspected as a source of 
imperfection. 

Results in context of other similar work 

There seem to be no other studies exploring the relation between 
effect size of bone marrow stem cell therapy and the number of 
discrepancies in the reports. Several meta-analyses have covered 
bone marrow stem cell therapy for increasing ejection fraction 
but have not discussed the discrepancies in the reports. They 
all concluded that the average effect was a significant increase 
in ejection fraction. This was reiterated in a Cochrane review
and the recent MESS (meta-analysis of cardiac stem cell studies)
meta-analysis.

Our findings expand on these meta-analyses. Our study concurs 
that viewing all the studies together as a single entity, there is 
on average a positive effect on ejection fraction. However, we 
found that the positivity was not consistent across the spectrum 
of discrepancy count. The studies with the most discrepancies 
seem to be contributing most to the positivity, while the studies 
with no discrepancies show a zero effect. Averaging effect size 
across all studies might therefore not be wise because it does 
not reflect their varying factual accuracy. 

Standard meta-analyses include quality assessment, but this 
does not seem to involve identifying or quantifying factual 
discrepancies. Well conducted meta-analyses have somehow
classed many studies with numerous discrepancies as high
quality.

Discrepancy count seems additive to a traditional assessment 
of risk of bias (appendix 9). If our findings are verified by other 
workers in other specialties, then addition of discrepancy 
checking, and ideally cross checking with raw data, might make 
meta-analysis more illuminating. 

Possible explanations 

We do not know the cause of the discrepancies. We have asked 
for resolution of over 150 discrepancies through journals. None
were resolved, although we found it triggered correspondence 
from lawyers. 

One possibility is that authors might feel pressure for results to 
match expectations. One signal of a misguided desire to please 
is the phenomenon of directed editing of rounded percentages 
to force them to add up to 100%. In reality, correctly rounded 
percentages should often not add up to 100% when there are 
many categories. The effect of even a little bias can be
surprisingly dramatic.
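The point about rounded percentages is easy to verify. The short sketch below uses hypothetical category counts and shows that correctly rounded whole-number percentages can add up to 99 or 102 as well as 100. The function name `rounded_percentages` is illustrative only.

```python
def rounded_percentages(counts):
    """Percentage of the total in each category, rounded half up to a whole number."""
    total = sum(counts)
    # Integer arithmetic avoids floating point surprises at the .5 boundary.
    return [(200 * c + total) // (2 * total) for c in counts]


three_equal_groups = rounded_percentages([1, 1, 1])   # each is 33%
six_equal_groups = rounded_percentages([1] * 6)       # each is 17%
print(sum(three_equal_groups), sum(six_equal_groups))  # prints: 99 102
```

A table in which many small categories always sum to exactly 100% is therefore more likely to have been adjusted than one in which the total occasionally drifts by a point or two.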

Secondly, exciting new treatments might be reported before full
checking. One sign of this, in the neighbouring specialty of
cardiomyocyte-derived stem cell therapy, is the insertion of the
word "randomised" into the title of the journal publication that
was not present in the manuscript finalised by the authors on
PubMed Central. There were seven controls in total, but after
subtraction of the four who were not randomised and one who
was randomised to stem cells but refused treatment, the number
of randomised controls was only two. For the Lancet this was
a new low.

Thirdly, bone marrow stem cell therapy might be less effective 
when it is carried out in a rigidly standardised way. Centres 
with less attention to detail might incorporate an unnoticed 
contaminant that enhances the effect of treatment. These centres 
might produce reports with more discrepancies. In support of 
this, just over a fifth of the trials (t07, t08, t09, t11, t21, t27, t33,
t35, t40, t43, t44, t49) showed ejection fraction effects of 7% 
or more, but these trials accounted for more than half of all the 
discrepancies. 

The final possibility is that in the reports with the fewest 
discrepancies, the ejection fraction effect might also have been 
measured with least error. If so, the true effect of bone marrow 
stem cells on ejection fraction is zero. 

Implications for correctness of values 
reported in trials 

When trials provide full data, serious errors in reporting can 
come to light, such as omitted patients, reclassification of
causes of death, or studies based on fictitious data. As
full data disclosure is rare, readers currently cannot estimate
how many trial reports are incorrect. It is essential that there is
open access to data.

In our study, the reported standard deviation of the NYHA score
(New York Heart Association classification for chronic heart 
failure) offers a unique window into correctness of reporting, 
which does not require raw data. If NYHA data for individual 
patients are fabricated, then the means and standard deviations 
will remain mathematically possible. 

It is only when standard deviations are not correctly calculated 
from real NYHA values that mathematically impossible values 
can arise. Of 11 trials reporting a standard deviation of NYHA,
the values in five (45%) are mathematically impossible. 
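To show how a standard deviation can be mathematically impossible, the sketch below applies one well known limit, the Bhatia-Davis inequality: values confined to the range 1 to 4 with mean m cannot have a population variance greater than (4-m)(m-1). This is offered as an illustration and may not coincide with the limits derived in appendix 11, which is not reproduced here. The function names and the rounding tolerance are assumptions.

```python
import math


def max_sample_sd(mean, n, lowest=1.0, highest=4.0):
    """Largest sample SD (n-1 denominator) possible for n values in [lowest, highest]."""
    if n < 2:
        raise ValueError("a sample SD needs at least two patients")
    if not lowest <= mean <= highest:
        raise ValueError("mean lies outside the scale, which is itself impossible")
    # Bhatia-Davis bound on the population variance of bounded values.
    max_population_variance = (highest - mean) * (mean - lowest)
    # Convert from the n denominator to the n-1 denominator.
    return math.sqrt(max_population_variance * n / (n - 1))


def sd_is_possible(mean, sd, n, lowest=1.0, highest=4.0, tolerance=0.05):
    """False if the reported SD exceeds the bound by more than a rounding tolerance."""
    return sd <= max_sample_sd(mean, n, lowest, highest) + tolerance
```

For a hypothetical group of 10 patients with mean NYHA class 2.5, no standard deviation above about 1.58 is attainable, so a reported value of 2.0 could not have come from real NYHA classes. A reported mean below 1, such as the 0.7 listed in table 1, fails even before the standard deviation is considered.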

The NYHA score is simple to measure, and the standard 
deviation is simple to calculate. Ejection fraction effect is more 
complex to measure and its statistical significance is more 
complex to calculate. We are concerned that, if simple 
calculations on simple variables are definitely incorrect in almost 
half of trials, then the more subtle statistical statements regarding 
more subtle variables might in most cases also be incorrect. 

Implications for interpreting trial design 

Readers need to know whether a trial is randomised or not, but 
the reports were sometimes vague or even contradictory. Some 
trials were initially reported as accepter-rejecter 
(non-randomised) and later as randomised (t21, t41, t49). In
one, a later publication recalled the existence of a placebo 
control group (t07r3). In this specialty, patients' voluntary choice 
is sometimes considered a form of randomisation, a policy 
accepted by some journals (t21/102). Identical tables and
identical figures have inexplicably been presented as results of
different studies, with different names, different designs
(randomised versus accepter-rejecter), and different sample
sizes (t07r1, t07r10).

Journals could resolve such discrepancies but currently do not
consider this a priority (t07r1, t07r10, t21r5, t21r11, t49r1,
t49r3).

Implications for safety of bone marrow stem 
cell therapy 

The safety of bone marrow stem cell therapy is underlined by
a large report focusing on this. Unfortunately it too contains
many discrepancies (appendix 7), including impossible
percentages and conflicts between tables and figures, perhaps
because the reassuring findings had to be made available with
urgency.

Implications for clinicians and researchers 

If patients ask for advice on which bone marrow stem cell trial 
to enter, we want to maximise the benefit to them, while 
maximising their contribution to reliable evidence for future 
patients. Unfortunately, these seem to be in conflict (fig 3).

Sometimes researchers feel that only findings of positive effects 
indicate scientific success. But meticulously reported studies 
reporting neutral effects are vital contributions to science. It is 
more valuable to have a reliable report of a small improvement 
than an unreliable report of a large improvement. Error-free 
reporting is difficult to achieve, as we have found in our own 
experience. Only 10% of these trials were reported without
introducing discrepancies. We consider these to be the greatest 
scientific successes, even though the effect size was 
unfortunately zero. 

Several lessons can be drawn for the design of future trials of 
bone marrow stem cells. Prior registration on a public clinical 
trial registry was not universal and would have been helpful in 
distinguishing unambiguously between trials that were multiply 
published or merely identical by coincidence. 

We recommend that reports include a spreadsheet of all the data 
used for construction of the tables, so that incorrect values could 
be more easily identified. Readers should accept that authors 
cannot avoid errors; in turn authors should correct errors 
promptly and indicate clearly when later reports incorporate 
corrections. It should be remembered that the 604 discrepancies 
listed in appendices 1-3 are unlikely to be all the errors. They 
are only those detectable by us without any information beyond 
the published reports. Disclosing the individual patient data 
could help to correct more errors. 

It is important for studies using change in ejection fraction as 
an endpoint to be properly designed to resist error and to have 
adequate sample size to combat the effects of biological 
variability. Left ventricular ejection fraction is a mutable 
variable, which in some modalities is easily manipulated 
innocently by clinicians who have prior beliefs on what a 
realistic value should be for a particular patient. Sample size 
planning can sometimes be erroneously omitted when clinicians 
are enthusiastic to "demonstrate the effectiveness" of a treatment 
seen as exciting. 

Conclusions 

It is difficult to avoid discrepancies in clinical trial reports. Trials 
with progressively fewer discrepancies tend to find progressively 
smaller effects on ejection fraction of bone marrow stem cell 
therapy. The reason for this association is unknown. The few 
trials for which the discrepancy count was zero had a stem cell 
effect size that was also zero. 

Notes added at proof stage: The institution of t07, t08, t21, and t49 is
recently reported to have identified evidence of misconduct and has
notified the city prosecutor. The institution of the SCIPIO trial is recently
reported to have requested that the publication be retracted.

Tables



Table 1| Spectrum of discrepancies in published reports of autologous bone marrow stem cell trials and enhancement of ejection fraction

Discrepancy: Studies that are reported by authors to be both accepter-rejecter and randomised
Examples: Authors published data as accepter-rejecter, and then later described similar (t41/101) or same (t07/101) data as randomised (t08/101, t21/102, t49/103)

Discrepancy: Impossible numbers of patients, percentages, sums, or summary statistics
Examples: 100% of 6 patients referred to as 5 patients (t01/205) or 50% of 9 patients on a drug (t38/208). 20 patients in 3 groups whose sizes add up to 19 (t16/201). Standard deviation (SD) impossibly wide (t43/308) or mathematically impossible (t34/301) (explanation in appendix 11). Large identical changes in mean in both recipients and control with no change in SD (t21/206)

Discrepancy: Sex reclassification
Examples: Women present in early reports seem to have become men by later reports (t01/202, t21/203)

Discrepancy: Zero or negative NYHA class (which can only be I, II, III, or IV)
Examples: NYHA class of 0 postoperatively, only in stem cell recipients, giving mean of 0.7 (t35/302). One patient seems to have NYHA of -5 (t07/334)

Discrepancy: Indicating non-significant differences as significant, or significant differences as non-significant (actively or by omission)
Examples: Mean values of 2.38 (SEM 0.26) and 2.2 (SEM 0.20) reported as significantly different, but t test result not significant (t22/301). Groups that differ with P<0.001 described as comparable (t07/305, t49/202). Readers not informed of highly significant changes in control group, which if calculated show P values as low as <1×10^-100 (t07/352)

Discrepancy: Conflicts in protocol or follow-up
Examples: Patients who died or were lost to follow-up were still taking drugs, reporting symptoms, and undergoing tests (t41/201, t07/357, t07/358, t07/359, t07/360). Discrepancy over whether controls had sham injection or how injection could reach stated position (t35/201, t35/202). Of 41 patients, at 3 years, 12 had died or 10 had died, or perhaps none had died since all 41 reported their NYHA class at 3 years (t41/301, t41/302, t41/303)

Discrepancy: Fiddly figures: contradictions
Examples: Conflicts between figures and numerical data (t40/302, t29/305). Measurement spread increases but SD shrinks (t46/301), or SD bars vary but SD stays same (t40/302, t40/303, t40/304, t40/305). More patients on graph showing individuals' EFs than were supposed to be in study (t42/302). Conflicts between tables in numbers of patients (t28/301) or means and SDs (t27/307)

Discrepancy: Principal report is of significant effect, subsequent report (presumably a correction) shows effect had been smaller and non-significant
Examples: EF effect of +7.1 ("P=0.05"), but assembling effects in two subgroups shows overall effect of +6.5 (P>0.05) (t12/301)

NYHA=New York Heart Association; SEM=standard error of mean; EF=ejection fraction.
Table 2| Unusual phenomena not listed as discrepancies in published reports of autologous bone marrow stem cell trials and enhancement of ejection fraction

Phenomenon: Randomised results presented intermingled with non-randomised
Example: 75 patients randomised between control and stem cells but results shown only as averages that include another 17 from uncontrolled cohort (t15r1). Significant stem cell effect was seen

Phenomenon: Vanishing SPECT EF
Example: All patients had radionuclide SPECT LVEF, but results not shown and indicated to be "similar trends" to echocardiographic EF, which showed significant stem cell effect (t42r1). In another trial all had radionuclide LVEF but results not shown; instead MRI LVEF substudy results are shown (t25r2). Significant stem cell effect was seen

Phenomenon: Reversal of NYHA-mortality link
Example: Of patients with NYHA III and IV, within one year almost all the NYHA III patients died and almost all the NYHA IV survived (t34r3)

Phenomenon: Extraordinarily narrowly distributed EFs during follow-up
Example: Given natural test-retest variability of EF in single individual, distribution of measurements across patients should be substantially wider, but this is not always reflected in trial data (t33r1). Large stem cell effect was seen

Phenomenon: Delayed recollection of lack of blinding
Example: One study initially published as double blinded, but subsequently authors issued corrigendum whose only effect was removal of words "double-blindedly" (t44r1). Large stem cell effect was seen

Phenomenon: Controls received intracoronary injection of cell culture medium not licensed for use in humans
Example: Control subjects received infusion into coronary arteries of cell culture medium X-VIVO 10 (t17r4) "designed to support the generation of Lymphokine Activated Killer cells"; manufacturer warns that it is "not approved for human or veterinary use, or for application to humans or animals." Significant stem cell effect was seen

Phenomenon: Unconventional informed consent process for randomised controlled trial
Example: Consent for randomisation obtained from relatives rather than patients (t33r1), or not described at all (t40r2, t40r3, t44r1)

Phenomenon: St Ives syndrome
Example: Each patient can have many treatment episodes, each episode can be counted in more than one trial, each trial can have more than one report (with different names), each report can appear in more than one meta-analysis, and sum of patient counts from all meta-analysis can be totalled up, producing multiple levels of multiple counting (t49r1). Large stem cell effect was seen

Phenomenon: Balls
Example: One study reported randomisation "using a nonparticipant in the study to pick a red ball ... or blue ball," which seems insufficient guarantee of bias-resistance (t35r1). Large stem cell effect was seen

SPECT=single photon emission computed tomography; LVEF=left ventricular ejection fraction; NYHA=New York Heart Association; SEM=standard error of mean; EF=ejection fraction; MRI=magnetic resonance imaging.






BMJ 2014;348:g2688 doi: 10.1136/bmj.g2688 (Published 29 April 2014)






Figures 



Records identified through Medline and Embase search (n=2016)
Records identified from other sources (n=4)

Records screened (n=2020)
Excluded by title and/or abstract (n=1924): published after April 2013, duplicates, not clinical trials, not in adult humans aged >18 years, not available in English or German, not randomised or not autologous bone marrow stem cells against control, EF not measured

Full texts assessed (n=96)
Excluded by full text (n=47):
Duplicates (n=18)
Not autologous bone marrow stem cells against control (n=18)
Mean change in EF not available (n=8)
Not randomised (n=3)

Individual trials included (n=49)



Fig 1 Identification of randomised controlled trials of autologous bone marrow stem cells for heart disease (EF=ejection 
fraction) 
Fig 2 Correlation between number of discrepancies in trial's reports and ejection fraction (EF) effect size. Dot area is
proportional to trial's sample size (Spearman's r=0.4, P=0.005)

Fig 3 Mean ejection fraction (EF) effect size by number of discrepancies in trial's reports. Error bars here show only SE of
mean effect size weighted for sample size across trials in each category. Formal meta-analytic confidence intervals, which
fully integrate sample size and uncertainty within each trial, are available only for subset of trials (see appendix 10)